
Using Virtual Load/Store Queues (VLSQs) to Reduce the Negative Effects of Reordered Memory Instructions


Abstract

The use of large instruction windows coupled with aggressive out-of-order and prefetching capabilities has provided significant improvements in processor performance. In this paper, we quantify the effects of increased out-of-order aggressiveness on a processor’s memory ordering/consistency model as well as on an application’s cache behavior. We observe that increasing the reorder buffer size causes fewer than one third of issued memory instructions to be executed in actual program order. We show that increasing the reorder buffer size from 80 to 512 entries increases the frequency of memory traps by a factor of six and increases total execution overhead by 10–40%. Additionally, we observe that the reordering of memory instructions increases L1 data cache accesses by 10–60% and L1 data cache misses by 10–20%. These findings reveal that increased out-of-order capability can waste energy in two ways. First, re-fetching and re-executing instructions flushed due to traps requires the fetch, map, and execution units to dissipate energy on work that has already been done. Second, an increase in the number of cache accesses and cache misses needlessly dissipates energy. Both of these side effects can be related to the reordering of memory instructions. Thus, to avoid wasting both energy and performance, we propose a virtual load/store queue (VLSQ) within the existing physical load/store queue. The VLSQ reduces the reordering of memory instructions by limiting the number of memory instructions visible to the select and issue logic. We show that VLSQs can reduce trap overhead, cache accesses, and cache misses by as much as 45%, 50%, and 15%, respectively, when compared to traditional load/store queues. We observe that these reductions yield net power savings of 10–50% with a performance degradation of 1–5%.
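The core idea of the abstract, restricting how many queued memory instructions the select logic can see, can be illustrated with a toy model. The sketch below is not the authors' simulator. The function name `simulate`, the one-issue-per-cycle rule, the oldest-ready-first selection policy, and the random readiness times are all assumptions made for illustration only.

```python
import random


def simulate(num_ops=500, physical_size=64, virtual_size=64, seed=0):
    """Toy model (hypothetical): fraction of memory ops issued in program order.

    virtual_size is the number of oldest queue entries that the select
    logic is allowed to see; physical_size is the real queue capacity.
    """
    rng = random.Random(seed)
    # Assumed readiness model: op i becomes ready at a pseudo-random cycle.
    ready_at = [i // 2 + rng.randint(0, 40) for i in range(num_ops)]

    queue = []        # op indices held in the physical queue, program order
    next_fetch = 0    # next op (in program order) to enter the queue
    issued = 0
    in_order = 0
    cycle = 0

    while issued < num_ops:
        # Fill the physical queue in program order.
        while len(queue) < physical_size and next_fetch < num_ops:
            queue.append(next_fetch)
            next_fetch += 1

        # The select logic only sees the oldest `virtual_size` entries.
        visible = queue[:virtual_size]
        ready = [op for op in visible if ready_at[op] <= cycle]

        if ready:
            op = ready[0]  # assumed policy: oldest ready op, one per cycle
            if op == queue[0]:
                # No older op is still waiting, so this issue is in order.
                in_order += 1
            queue.remove(op)
            issued += 1

        cycle += 1

    return in_order / num_ops


if __name__ == "__main__":
    for window in (1, 4, 16, 64):
        frac = simulate(virtual_size=window)
        print(f"virtual window {window:2d}: {frac:.2%} issued in program order")
```

With a virtual window of one entry the model degenerates to strictly in-order issue, while wider windows let younger ready instructions overtake older stalled ones. The model says nothing about traps, cache behavior, or power, which the paper measures on a real processor simulator.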

Bibliographic details

  • Authors

    Jaleel, Aamer; Jacob, Bruce;

  • Author affiliation
  • Year 2005
  • Total pages
  • Original format PDF
  • Language en_US
  • Chinese Library Classification
